Protecting user data during audio interactions

ABSTRACT

A method for protecting user data during an audio interaction includes various operations performed by a processing system including at least one processor. In one example, the operations include detecting an audio signal that is part of an interaction between a user and another party, converting the audio signal into a string of text, detecting that the interaction is likely to put sensitive data of the user at risk, based on a comparison of the string of text to a library of interactions that are known to put sensitive data at risk, and sending an alert to notify the user that the interaction is likely to put the sensitive data of the user at risk, wherein the alert is sent to prevent the user from providing the sensitive data to the another party, and wherein the method is performed contemporaneously with an occurrence of the interaction.

This application is a continuation of U.S. patent application Ser. No. 16/921,854, filed Jul. 6, 2020, now U.S. Pat. No. 11,349,983, which is herein incorporated by reference in its entirety.

The present disclosure relates generally to data security, and relates more particularly to devices, non-transitory computer-readable media, and methods for protecting user data during audio interactions.

BACKGROUND

Fraud costs consumers billions of dollars each year, collectively. Moreover, an individual victim of fraud may spend much time trying to repair the non-financial damage of the fraud, such as replacing credentials and equipment, resetting access to accounts, and the like. For instance, a consumer may receive an email, a phone call, or even an in-person solicitation from a person claiming to have some legitimate need for the consumer's financial information such as a credit card number. If the person is not who they claim to be, however, the consumer may end up having to pay for purchases he did not make or authorize. The consumer may also spend a great deal of time and effort disputing fraudulent charges to his credit card, obtaining a new credit card with a new credit card number, and updating the credit card number on accounts that are automatically charged to the credit card.

SUMMARY

The present disclosure broadly discloses methods, computer-readable media, and systems for protecting user data during audio interactions. In one example, a method performed by a processing system includes detecting an audio signal that is part of an interaction between a user and another party, converting the audio signal into a string of text, detecting that the interaction is likely to put sensitive data of the user at risk, based on a comparison of the string of text to a library of interactions that are known to put sensitive data at risk, and sending, in response to detecting that the interaction is likely to put the sensitive data of the user at risk, an alert to notify the user that the interaction is likely to put the sensitive data of the user at risk, wherein the alert is sent to prevent the user from providing the sensitive data to the another party, and wherein the method is performed contemporaneously with an occurrence of the interaction.

In another example, a non-transitory computer-readable medium may store instructions which, when executed by a processing system in a communications network, cause the processing system to perform operations. The operations may include detecting an audio signal that is part of an interaction between a user and another party, converting the audio signal into a string of text, detecting that the interaction is likely to put sensitive data of the user at risk, based on a comparison of the string of text to a library of interactions that are known to put sensitive data at risk, and sending, in response to detecting that the interaction is likely to put the sensitive data of the user at risk, an alert to notify the user that the interaction is likely to put the sensitive data of the user at risk, wherein the alert is sent to prevent the user from providing the sensitive data to the another party, and wherein the method is performed contemporaneously with an occurrence of the interaction.

In another example, a device may include a processing system including at least one processor and a non-transitory computer-readable medium storing instructions which, when executed by the processing system when deployed in a communications network, cause the processing system to perform operations. The operations may include detecting an audio signal that is part of an interaction between a user and another party, converting the audio signal into a string of text, detecting that the interaction is likely to put sensitive data of the user at risk, based on a comparison of the string of text to a library of interactions that are known to put sensitive data at risk, and sending, in response to detecting that the interaction is likely to put the sensitive data of the user at risk, an alert to notify the user that the interaction is likely to put the sensitive data of the user at risk, wherein the alert is sent to prevent the user from providing the sensitive data to the another party, and wherein the method is performed contemporaneously with an occurrence of the interaction.

BRIEF DESCRIPTION OF THE DRAWINGS

The teachings of the present disclosure can be readily understood by considering the following detailed description in conjunction with the accompanying drawings, in which:

FIG. 1 illustrates an example system in which examples of the present disclosure for protecting user data during audio interactions may operate;

FIG. 2 illustrates a flowchart of an example method for protecting user data during audio interactions, in accordance with the present disclosure; and

FIG. 3 illustrates an example of a computing device, or computing system, specifically programmed to perform the steps, functions, blocks, and/or operations described herein.

To facilitate understanding, similar reference numerals have been used, where possible, to designate elements that are common to the figures.

DETAILED DESCRIPTION

The present disclosure broadly discloses methods, computer-readable media, and systems for protecting user data during audio interactions. As discussed above, fraud costs consumers billions of dollars each year, collectively. Moreover, an individual victim of fraud may spend much time trying to repair the non-financial damage of the fraud, such as replacing credentials and equipment, resetting access to accounts, and the like. As the individuals perpetrating the fraud become more creative with their approaches and their uses of technology, it becomes more difficult, particularly for less technologically savvy consumers, to tell whether an individual requesting sensitive or personal or financial data has a legitimate need for the data or is trying to obtain the data for fraudulent purposes.

For instance, in one scam that is becoming alarmingly common, an individual may receive a phone call from a caller claiming to be the individual's grandchild (or a person who is allegedly in contact with the individual's grandchild). The “grandchild” may claim that there is an emergency (e.g., their car broke down, or they are in the hospital, or they are stranded in a foreign country), and may ask the individual to immediately send a large sum of cash via a money transfer service. In reality, however, the individual's grandchild may be perfectly fine, and the caller may be a stranger attempting to coerce money out of the individual.

In another common scam, an individual may receive a phone call from a caller claiming to be from the Internal Revenue Service or from a service provider such as the electric company or the water company. The caller may claim that the individual owes back taxes or is delinquent on a bill and subject to immediate loss of utility services. The caller may ask that the individual purchase a gift card as a means of payment, and provide the gift card number and PIN to the caller.

Although the above described scams are well known, they managed to fool large numbers of people, including elderly people, before they became widely known. Moreover, even though most people now know to be cautious when receiving a phone call such as one of the calls described above, many people are still unaware of the dangers posed by the calls. Additionally, some callers may rely on the element of surprise in order to fool an individual when his or her guard may be down (e.g., by calling in the middle of the night).

In another more dangerous scam, an individual may actually physically arrive at the potential victim's home, work or car and request information from the potential victim. In certain scenarios, the individual may even request that the potential victim perform a specific action instead of just providing sensitive information verbally. For example, the individual may ask the potential victim to open a door to a home or a car door, to provide a physical key to a door or a safe deposit box located in a bank, or to provide physical cash on the spot, and the like. For example, the scammer may pose as a bank employee who is delivering replacement keys for safe deposit boxes that have recently received newly installed locks. In turn, the scammer may now require the potential victim to surrender his or her “old” safe deposit keys.

Examples of the present disclosure function as a technologically savvy third party in audio interactions between a user and another party. For instance, in one example, a processing system may listen in on an audio interaction (e.g., a phone call or an in-person conversation) between a user and another party. When the processing system detects a word or phrase in the audio interaction that may signal a risk to the user's sensitive data (e.g., personal and/or financial information) or a request that the user perform a risky action (e.g., surrendering his or her safe deposit box key), the processing system may alert the user to the fact that his or her sensitive data (or in one embodiment his or her action) may be at risk (e.g., by activating a visible, audible, and/or tactile alert), thereby minimizing the chance that the user will reveal the sensitive data to the other party (or perform the risky action as requested by the other party). For example, personal information may comprise a social security number, a passport number, a driver license number, birthday information, place of birth information, prior employment information, family relationship information and the like. For example, financial information may comprise bank account numbers, PIN numbers, cash amounts in the bank accounts, safe deposit box number, the name of banking institutions holding the user's funds, passwords on bank accounts, user log in names to bank websites, and the like. In further examples, the processing system may interrupt the interaction to ask the other party questions, to answer on behalf of the user, and/or to direct the other party to another individual (such as the user's family member or caregiver). Thus, examples of the present disclosure may act on the user's behalf in order to protect the user's sensitive data from other parties who may try to obtain the sensitive data (e.g., through unsolicited interactions initiated by the other parties). These and other aspects of the present disclosure are discussed in greater detail below in connection with the examples of FIGS. 1-3.

To further aid in understanding the present disclosure, FIG. 1 illustrates an example system 100 in which examples of the present disclosure for protecting user data and/or preventing the performance of a risky action during audio interactions may operate. The system 100 may include any one or more types of communication networks, such as a traditional circuit switched network (e.g., a public switched telephone network (PSTN)) or a packet network such as an Internet Protocol (IP) network (e.g., an IP Multimedia Subsystem (IMS) network), an asynchronous transfer mode (ATM) network, a wired network, a wireless network, and/or a cellular network (e.g., 2G-5G, a long term evolution (LTE) network, and the like) related to the current disclosure. It should be noted that an IP network is broadly defined as a network that uses Internet Protocol to exchange data packets. Additional example IP networks include Voice over IP (VoIP) networks, Service over IP (SoIP) networks, the World Wide Web, and the like.

In one example, the system 100 may comprise a core network 102. The core network 102 may be in communication with one or more access networks 120 and 122, and with the Internet 124. In one example, the core network 102 may functionally comprise a fixed mobile convergence (FMC) network, e.g., an IP Multimedia Subsystem (IMS) network. In addition, the core network 102 may functionally comprise a telephony network, e.g., an Internet Protocol/Multi-Protocol Label Switching (IP/MPLS) backbone network utilizing Session Initiation Protocol (SIP) for circuit-switched and Voice over Internet Protocol (VoIP) telephony services. In one example, the core network 102 may include at least one application server (AS) 104 and at least one database (DB) 106. For ease of illustration, various additional elements of the core network 102 are omitted from FIG. 1.

In one example, the access networks 120 and 122 may comprise Digital Subscriber Line (DSL) networks, public switched telephone network (PSTN) access networks, broadband cable access networks, Local Area Networks (LANs), wireless access networks (e.g., an IEEE 802.11/Wi-Fi network and the like), cellular access networks, 3rd party networks, and the like. For example, the operator of the core network 102 may provide a cable television service, an IPTV service, or any other types of telecommunication services to subscribers via access networks 120 and 122. In one example, the access networks 120 and 122 may comprise different types of access networks, may comprise the same type of access network, or some access networks may be the same type of access network and others may be different types of access networks. In one example, the core network 102 may be operated by a telecommunication network service provider (e.g., an Internet service provider, or a service provider who provides Internet services in addition to other telecommunication services). The core network 102 and the access networks 120 and 122 may be operated by different service providers, the same service provider or a combination thereof, or the access networks 120 and/or 122 may be operated by entities having core businesses that are not related to telecommunications services, e.g., corporate, governmental, or educational institution LANs, and the like.

In one example, the access network 120 may be in communication with one or more user endpoint devices 108 and 110. Similarly, the access network 122 may be in communication with one or more user endpoint devices 112 and 114. The access networks 120 and 122 may transmit and receive communications between the user endpoint devices 108, 110, 112, and 114, between the user endpoint devices 108, 110, 112, and 114 and the AS 104, between the user endpoint devices 108, 110, 112, 114, the Internet of Things (IoT) devices 116 and 118, and the AS 104, other components of the core network 102, devices reachable via the Internet in general, and so forth.

In one example, each of the user endpoint devices 108, 110, 112, and 114 may comprise any single device or combination of devices that may comprise a user endpoint device. For example, the user endpoint devices 108, 110, 112, and 114 may each comprise a mobile device, a cellular smart phone, a gaming console, a set top box, a laptop computer, a tablet computer, a desktop computer, a wearable smart device (e.g., a smart watch, smart glasses, or a fitness tracker), an application server, a bank or cluster of such devices, and the like.

The access networks 120 and 122 may also be in communication with one or more Internet of Things (IoT) devices 116 and 118, respectively. The IoT devices 116 and 118 may comprise wired or wireless devices that are installed in a user's home or business. The IoT devices 116 and 118 may be controlled, via a controller, a mobile device, a computer, or the like, to control one or more systems in the user's home or business. For instance, the IoT devices 116 and 118 may comprise alarm systems, smart thermostats, doorbells including cameras, smart lighting systems, virtual assistants, smart audio systems, and/or other types of devices.

In accordance with the present disclosure, the AS 104 may be configured to provide one or more operations or functions in connection with examples of the present disclosure for protecting user data during audio interactions, as described herein. The AS 104 may comprise one or more physical devices, e.g., one or more computing systems or servers, such as computing system 300 depicted in FIG. 3, and may be configured as described below to protect user data during audio interactions. It should be noted that as used herein, the terms “configure,” and “reconfigure” may refer to programming or loading a processing system with computer-readable/computer-executable instructions, code, and/or programs, e.g., in a distributed or non-distributed memory, which when executed by a processor, or processors, of the processing system within a same device or within distributed devices, may cause the processing system to perform various functions. Such terms may also encompass providing variables, data values, tables, objects, or other data structures or the like which may cause a processing system executing computer-readable instructions, code, and/or programs to function differently depending upon the values of the variables or other data structures that are provided. As referred to herein a “processing system” may comprise a computing device including one or more processors, or cores (e.g., as illustrated in FIG. 3 and discussed below) or multiple computing devices collectively configured to perform various steps, functions, and/or operations in accordance with the present disclosure.

In one example, the AS 104 may be configured to protect user data and/or prevent the performance of a risky action during audio interactions. As discussed above, an audio interaction may comprise a phone call or an in-person conversation in which a user is involved. The AS 104 may cooperate with a user endpoint device of the user, such as one of the user endpoint devices 108, 110, 112, or 114 or one of the IoT devices 116 or 118 (e.g., a phone via which the user is having a phone call, or a virtual assistant in a room in which the user is having a conversation) in order to protect the user's sensitive data and/or prevent the performance of a risky action during the audio interaction. For instance, the user endpoint device may listen to the audio interaction and may send all or parts of the interaction to the AS 104 for analysis (e.g., to determine whether the audio interaction is likely to put the user's sensitive data at risk and/or to cause the performance of a risky action by the user). Alternatively, the user endpoint device may send keywords extracted from the audio interaction by the user endpoint device to the AS 104, where the AS 104 may analyze the keywords for likelihood of risk. In another example, the user endpoint device may send non-audio data about the audio interaction to the AS 104 for analysis, such as a phone number from which a caller is calling, an alleged name of the caller, a company that the caller claims to represent, or the like.

As discussed in further detail below, the AS 104 may analyze data about the audio interaction (e.g., transcripts, keywords, and/or other data) against a library of interactions that are known to be risky (e.g., likely to be fraudulent, to put sensitive user data at risk, or to cause the performance of a risky action). One specific example of a method for protecting user data during audio interactions according to the present disclosure is described in greater detail in connection with FIG. 2.

The DB 106 may store information about audio interactions that are known to be fraudulent or risky (e.g., likely to put sensitive data at risk or cause the performance of a risky action). For instance, each entry in a database 126 may be associated with a historical instance, or a composite of historical instances, of an audio interaction (e.g., phone call or in-person conversation) that was fraudulent or risky. The instance of the historical audio interaction may further be associated with certain keywords that are most strongly correlated with fraud, or most strongly correlated with a risk to sensitive data. For instance, in the example discussed above, where the caller claims to be a relative who needs money, words or phrases such as “emergency,” “cash,” “money transfer,” “wire transfer,” “PIN number,” “social security number,” “bank account number,” “saving account number,” or the like, may indicate a higher likelihood of risk. Similarly, the instance of the historical audio interaction may further be associated with certain interaction or conversation patterns that are most strongly correlated with fraud, or most strongly correlated with a risk to sensitive data or performance of a risky action, e.g., “give me your safe deposit key,” “give me your house key,” “give me your car key,” “open your door now,” “open your window now,” “step outside now,” “come to the street,” “open your garage door,” and so on. Following the same example, an interaction pattern that is considered to be risky may include the following events in order (possibly with other events occurring in between): (1) the caller claims to be a relative or in contact with a relative who is in trouble, e.g., who is being detained or very sick; (2) the caller states that there is an emergency situation (e.g., detainment, hospitalization, etc.); and (3) the caller asks for money to be immediately sent via a money transfer to address the emergency situation. Keywords, patterns, and other data directly taken from the audio of an interaction may be referred to herein as intrinsic data.

The instance of the historical audio interaction may further be associated with extrinsic data that is most strongly correlated with fraud, or most strongly correlated with a risk to sensitive data. The extrinsic data may include, for example, the time of day at which the call is received (e.g., the middle of the night), the phone number from which the call is received (e.g., the phone number may have been involved in multiple instances of the same fraudulent interaction), the country, state, or region from which the call originated, the company with which the caller claims to be involved or which the caller claims to represent, and/or other data.
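As an illustration only, one way a library entry combining intrinsic and extrinsic data might be represented is sketched below. The class names, field names, and event labels (e.g., `LibraryEntry`, `claims_relative`) are hypothetical and are not taken from the disclosure; the sample values simply echo the relative-in-trouble example discussed above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IntrinsicData:
    """Data taken directly from the audio of an interaction."""
    keywords: List[str] = field(default_factory=list)
    # Ordered event labels describing a conversation pattern (hypothetical labels).
    pattern: List[str] = field(default_factory=list)


@dataclass
class ExtrinsicData:
    """Data about the interaction that does not come from the audio itself."""
    hour_of_day: Optional[int] = None      # 0-23, local time the call was received
    phone_number: Optional[str] = None     # number from which the call was received
    origin_region: Optional[str] = None    # country, state, or region of origin
    claimed_company: Optional[str] = None  # company the caller claims to represent


@dataclass
class LibraryEntry:
    """One historical instance (or composite) of a risky interaction."""
    label: str
    intrinsic: IntrinsicData
    extrinsic: ExtrinsicData


# Example entry modeled on the relative-in-trouble scam described above.
relative_scam = LibraryEntry(
    label="relative-in-trouble",
    intrinsic=IntrinsicData(
        keywords=["emergency", "cash", "money transfer", "wire transfer"],
        pattern=["claims_relative", "states_emergency", "requests_money_transfer"],
    ),
    extrinsic=ExtrinsicData(hour_of_day=3),  # e.g., the middle of the night
)
```

Keeping the intrinsic and extrinsic fields in separate structures mirrors the distinction drawn in the text, but a flat record would serve equally well.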

In one example, the DB 106 may comprise a physical storage device integrated with the AS 104 (e.g., a database server or a file server), or may be attached or coupled to the AS 104, in accordance with the present disclosure. In one example, the AS 104 may load instructions into a memory, or one or more distributed memory units, and execute the instructions for protecting user data and/or preventing the performance of a risky action during audio interactions, as described herein.

In one example, one or more servers 128 and databases (DBs) 126 may be accessible to the AS 104 via Internet 124 in general. The servers 128 may include Web servers that support physical data interchange with other devices connected to the World Wide Web. For instance, the Web servers may support Web sites for Internet content providers, such as social media providers, ecommerce providers, service providers, news organizations, and the like. At least some of these Web sites may include sites via which users may report fraudulent communications (e.g., phone numbers from which fraudulent calls were received, phishing emails, and the like).

In one example, the databases 126 may store information about audio interactions that are known to be fraudulent or risky (e.g., likely to put sensitive data at risk and/or to cause the performance of a risky action). For instance, the databases 126 may contain information that is similar to the information contained in the DB 106, described above.

It should be noted that the system 100 has been simplified. Thus, those skilled in the art will realize that the system 100 may be implemented in a different form than that which is illustrated in FIG. 1, or may be expanded by including additional endpoint devices, access networks, network elements, application servers, etc. without altering the scope of the present disclosure. In addition, system 100 may be altered to omit various elements, substitute elements for devices that perform the same or similar functions, combine elements that are illustrated as separate devices, and/or implement network elements as functions that are spread across several devices that operate collectively as the respective network elements.

For example, the system 100 may include other network elements (not shown) such as border elements, routers, switches, policy servers, security devices, gateways, a content distribution network (CDN) and the like. For example, portions of the core network 102, access networks 120 and 122, and/or Internet 124 may comprise a content distribution network (CDN) having ingest servers, edge servers, and the like. Similarly, although only two access networks, 120 and 122 are shown, in other examples, access networks 120 and/or 122 may each comprise a plurality of different access networks that may interface with the core network 102 independently or in a chained manner. For example, UE devices 108, 110, 112, and 114 and IoT devices 116 and 118 may communicate with the core network 102 via different access networks, user endpoint devices 110 and 112 may communicate with the core network 102 via different access networks, and so forth. Thus, these and other modifications are all contemplated within the scope of the present disclosure.

FIG. 2 illustrates a flowchart of an example method 200 for protecting user data (and/or preventing the performance of a risky action) during audio interactions, in accordance with the present disclosure. In one example, steps, functions and/or operations of the method 200 may be performed by a device as illustrated in FIG. 1, e.g., AS 104, a UE 108, 110, 112, or 114, an IoT device 116 or 118, or any one or more components thereof. In one example, the steps, functions, or operations of the method 200 may be performed by a computing device or system 300, and/or a processing system 302 as described in connection with FIG. 3 below. For instance, the computing device 300 may represent at least a portion of the AS 104 in accordance with the present disclosure. For illustrative purposes, the method 200 is described in greater detail below in connection with an example performed by a processing system, such as processing system 302. In one example, the method 200 is performed contemporaneously with the occurrence of an interaction between a user (e.g., a person who subscribes to a service that protects user data) and another party. In other words, the steps of the method 200 may be performed in real time, as the interaction is occurring.

The method 200 begins in step 202 and proceeds to step 204. In step 204, the processing system may detect an audio signal that is part of an interaction between a user and another party. For instance, the interaction may comprise a phone call (e.g., a video call or an audio-only call), where the phone call may have been initiated either by the user (e.g., a call by the user to a customer service number) or by the other party (e.g., an unsolicited call, a call from a purported service provider, etc.). In this case, the other party may be a human entity or a non-human entity (e.g., a bot or the like), where the human or non-human nature may be unclear or unknown to the user.

However, in another example, the interaction may comprise an in-person interaction between the user and the other party (e.g., an interaction with a cashier at a store, with a solicitor inside or outside the user's home, with a stranger who has approached the user in public, etc.). In this case, the other party is most likely a human entity.

In one example, the interaction is initiated by the other party rather than by the user. That is, the other party may approach the user, unsolicited and/or under false pretenses, to initiate the interaction. As described in further detail below, the method 200 may operate on behalf of the user to protect the user against any breaches of privacy (and/or to prevent the performance of a risky action) that may result from the unsolicited interaction.

In one example, the processing system executes a program in the background of a device of which the processing system is a part (e.g., the user's mobile phone or wearable smart device). The program may “listen” continuously for audio signals. In one example, the program may listen specifically for audio signals in which the user's voice can be detected (where one or more speech or audio processing techniques may be used to determine whether the audio signal includes utterances made by the user, which may indicate that the user is involved in a verbal interaction with another party).

In step 206, the processing system may convert at least a portion of the audio signal into a string of text (e.g., a string of words or phrases). For instance, the processing system may utilize one or more speech to text conversion techniques to generate a transcript of the interaction, where the transcript may comprise a plurality of strings of text. A string of text in this context may represent all or part of an utterance made by the user and/or the other party (e.g., one or more words uttered by the user and/or the other party).

In step 208, the processing system may detect that the interaction is likely to put sensitive data of the user at risk (and/or to cause the performance of a risky action), based on a comparison of the string of text to a library of interactions that are known to put sensitive data at risk (and/or to cause the performance of a risky action). For instance, as discussed above, the processing system may have access (e.g., directly, or indirectly through an application server or other devices) to a database that stores a library of known interactions. Each known interaction in the library may comprise a historical observed instance (or a composite of multiple historical observed instances) in which a person's sensitive data was compromised (and/or a risky action was performed). The string of text may be used as a query to search the library for any entries (known interactions) which share some threshold similarity with the string of text.

In one example, the interaction may be considered likely to put the sensitive data of the user at risk (and/or to cause the performance of a risky action) when a level of match between the string of text and an entry in the library of known interactions at least meets a threshold (e.g., at least a threshold number of words occur in both the string of text and the entry). For instance, a word in the string of text may be matched to a keyword that is associated with an interaction in the library. As an example, certain words, phrases, or topics may occur frequently in phishing scams and other types of risky or fraudulent interactions. These words, phrases, or topics may include, for instance, personal identifying terms (e.g., “social security number,” “maiden name,” “birth date,” etc.), financial terms (e.g., “credit card,” “bank account,” “money order,” etc.), security-related terms (e.g., “pin number,” “password,” etc.), or terms that pressure the user to act quickly (e.g., “emergency,” “urgent,” “time is of the essence,” etc.).
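The threshold-number-of-words comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names (`tokenize`, `shared_word_count`, `meets_threshold`), the simple lowercase word tokenization, and the default threshold of three shared words are all hypothetical choices, not details given in the disclosure.

```python
import re
from typing import Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def shared_word_count(text: str, entry_text: str) -> int:
    """Count distinct words that occur in both the string of text and the entry."""
    return len(tokenize(text) & tokenize(entry_text))


def meets_threshold(text: str, entry_text: str, min_shared: int = 3) -> bool:
    """Return True when the level of match at least meets the threshold.

    min_shared is an assumed, illustrative threshold value.
    """
    return shared_word_count(text, entry_text) >= min_shared


utterance = "Grandma there is an emergency, please send cash"
library_entry = "emergency send cash by money transfer"
risky = meets_threshold(utterance, library_entry)
```

Here the utterance and the entry share the words "emergency," "send," and "cash," so the assumed threshold of three is met. A practical system would likely also match multi-word phrases and topics, which this word-level sketch does not attempt.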

In one example, an interaction may be determined to be likely to put sensitive data of the user at risk (and/or to cause the performance of a risky action) when at least a threshold number or a threshold percentage of words, phrases, or topics occurring in the string of text also occur in or are associated with an interaction in the library (e.g., at least x percent of the words in the string of text also occur in the same single interaction in the library, e.g., greater than 70%, 80%, or 90%). In one example, various sliding levels of risk may be associated with different thresholds. For instance, when a number or percentage of matching words, phrases, or topics is below a first threshold, the interaction may be quantified as “low risk.” When the number or percentage of matching words, phrases, or topics is above the first threshold, but below a second threshold that is higher than the first threshold, the interaction may be quantified as “moderate risk.” When the number or percentage of matching words, phrases, or topics is above the second threshold, but below a third threshold that is higher than the second threshold, the interaction may be quantified as “high risk.” When the number or percentage of matching words, phrases, or topics is above the third threshold, the interaction may be quantified as “extremely high risk.”
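A minimal sketch of the sliding risk levels follows. The function names and the default threshold values are assumptions made for illustration; the values 70, 80, and 90 merely reuse the example percentages mentioned above and are not stated in the disclosure to be the first, second, and third thresholds. The text does not say how a value exactly equal to a threshold is classified, so this sketch assigns it to the higher tier.

```python
from typing import Set


def match_percentage(text_words: Set[str], entry_words: Set[str]) -> float:
    """Percentage of the words in the string of text that also occur in the entry."""
    if not text_words:
        return 0.0
    return 100.0 * len(text_words & entry_words) / len(text_words)


def risk_level(pct: float, first: float = 70.0, second: float = 80.0,
               third: float = 90.0) -> str:
    """Map a match percentage to a risk label using three ascending thresholds.

    Threshold defaults are illustrative assumptions; a value equal to a
    threshold is treated as belonging to the higher tier (also an assumption).
    """
    if not first < second < third:
        raise ValueError("thresholds must be strictly ascending")
    if pct < first:
        return "low risk"
    if pct < second:
        return "moderate risk"
    if pct < third:
        return "high risk"
    return "extremely high risk"


level = risk_level(match_percentage({"send", "cash", "now", "please"},
                                    {"send", "cash", "now"}))
```

In the usage line, three of the four words in the string of text occur in the entry, giving 75 percent, which falls between the assumed first and second thresholds.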

In another example, the presence of certain specific keywords in the string of text may trigger a determination that an interaction is risky, regardless of the total number or percentage of matching words in the string of text as a whole. For instance, if the text string contains a request for a user's social security number or the surrender of a safe deposit key, that fact alone may be enough to result in a determination that the interaction is likely to put sensitive data of the user at risk (and/or to cause the performance of a risky action).
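One possible way to combine such trigger keywords with a percentage threshold is sketched below, purely for illustration. The trigger phrases, the function name, and the 70 percent default are assumptions; a practical system would likely need more robust phrase detection than the simple substring test shown.

```python
# Hypothetical trigger phrases, based on the two examples given in the text.
TRIGGER_PHRASES = ("social security number", "safe deposit")


def is_risky_with_triggers(transcript, match_pct, pct_threshold=70.0):
    """Return True if a trigger phrase is present or the match threshold is met."""
    text = " ".join(transcript.lower().split())
    # A single trigger phrase is sufficient, regardless of the overall match level.
    if any(phrase in text for phrase in TRIGGER_PHRASES):
        return True
    return match_pct >= pct_threshold
```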

In another example, an interaction may be determined to be likely to put sensitive data of the user at risk (and/or to cause the performance of a risky action) when a pattern of the interaction between the user and the other individual, as reflected in the string of text, matches a pattern of an interaction in the library. For instance, some known types of risky or fraudulent interactions follow similar patterns. As an example, an interaction may begin with a stranger asking to buy some item that a user has posted for sale on a Web site. The stranger may continue by offering to send the user a money order or cashier's check for more than the asking price of the item. The stranger may next ask that, immediately after the user deposits the money order or cashier's check, the user wire the difference between the amount sent and the asking price of the item back to the stranger. As another example, a stranger may notify the user that the user is a beneficiary of a will written by an individual unknown to the user. The stranger may ask for the user's bank account number in order to make a monetary transfer to the user, or may ask for the user's social security number as a way to allegedly verify the user's identity. As another example, the stranger may arrive at the user's door informing him or her that the bank has sent the stranger to collect “old” safe deposit box keys as a service to bank consumers, given that new locks have recently been installed in all of the safe deposit boxes in the particular bank.
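The pattern matching described above may, in one hedged illustration, be approximated by checking whether the stages of a known pattern occur in order across the utterances of the interaction. The stage names, the cue phrases, and the ordered-stage approach itself are assumptions adopted for this sketch; the present disclosure does not limit pattern matching to this technique.

```python
# Hypothetical encoding of the overpayment pattern from the text as ordered stages,
# each stage being a label plus cue phrases that suggest the stage has occurred.
OVERPAYMENT_PATTERN = [
    ("offer_to_buy", ("buy", "purchase")),
    ("overpayment", ("money order", "cashier's check")),
    ("refund_request", ("wire", "send back")),
]


def matches_pattern(utterances, pattern):
    """Return True if every stage of the pattern occurs, in order, in the utterances.

    Each utterance can advance the pattern by at most one stage.
    """
    stage = 0
    for utterance in utterances:
        text = utterance.lower()
        if stage < len(pattern) and any(cue in text for cue in pattern[stage][1]):
            stage += 1
    return stage == len(pattern)
```

In this sketch, the same utterances presented in a different order would not match, which reflects the notion that the pattern lies in the progression of the interaction rather than in the mere presence of certain words.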

In a further example, the interaction may be considered likely to put the sensitive data of the user at risk (and/or to potentially cause the performance of a risky action) when a level of match between the string of text and an entry in the library of known interactions at least meets a threshold and one or more other risk factors are present. The other risk factors may include extrinsic data about the interaction (whereas the contents of the audio interaction, e.g., the utterances spoken, may comprise intrinsic data about the interaction). The other risk factor may comprise, for instance, an inability of the other party to respond or to respond satisfactorily to a challenge that is designed to verify the other party's identity. For instance, upon determining that the interaction is likely to put the sensitive data of the user at risk (and/or to cause the performance of a risky action), the processing system may send a code, a captcha, or the like to a phone number from which the other party is calling (or to an email address associated with an organization from which the other party is allegedly calling). The processing system may then ask the other party to provide the code, captcha, or the like that the processing system sent. If the other party is unable to provide the code, captcha, or the like, then this may indicate that the other party is not who they say they are. Alternatively or in addition, the processing system may request the other party's name, the other party's title, the other party's supervisor's name, or other information that may be cross-referenced against some data source (e.g., a social media page or a web site of the company from which the other party is alleged to be calling) to verify the information's authenticity.
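A minimal sketch of the code-based challenge, and of combining its outcome with the text match, is given below for illustration. The function names and the six-digit code length are assumptions, and the delivery of the code to the other party (e.g., by text message or email) is outside the scope of the sketch.

```python
import hmac
import secrets

DIGITS = "0123456789"


def issue_code(length=6):
    """Generate a random numeric code to send to the other party's claimed number."""
    return "".join(secrets.choice(DIGITS) for _ in range(length))


def challenge_passed(sent_code, spoken_response):
    """Compare the code that was sent with the digits in the transcribed response."""
    spoken_digits = "".join(ch for ch in spoken_response if ch in DIGITS)
    # Constant-time comparison; both arguments are ASCII strings.
    return hmac.compare_digest(sent_code, spoken_digits)


def is_risky_with_challenge(text_match_meets_threshold, passed_challenge):
    """Flag the interaction when the text match is met and the challenge was failed."""
    return text_match_meets_threshold and not passed_challenge
```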

Failure to respond correctly to the challenge might also indicate that the other party is non-human, such as a bot. For instance, many phone-based fraud schemes may utilize an automatic dialer or similar system to place phone calls. Upon reaching an individual who answers their phone and/or provides some preliminary amount of information, the call could subsequently be handed off to a human party who attempts to extract further information from the user.

Alternatively or in addition, the other risk factor may comprise a non-audio risk factor, such as whether a phone number from which the other party is calling is known (e.g., not hidden or showing as “unavailable” or “out of area”) or is known to be associated with risky interactions (e.g., is known to be associated with previous phone calls that put sensitive data of other people at risk, based on the information in the library of interactions). In another example, the phone number from which the other party is calling may fail to match a phone number of an entity that the other party is alleged to represent. For instance, the other party may say that he is calling from XYZ Bank. However, the processing system may, upon looking up a phone number for XYZ Bank, determine that the number from which the other party is calling (e.g., a number showing on the caller ID of the user's mobile phone) does not match the phone number for XYZ Bank. As another example, the other party may say that he is calling in regards to an account that the user has with some institution (e.g., a bank, a telecommunications service provider, a social media site, etc.). However, the processing system may, upon reviewing the user's email, banking records, social media history, or other records, determine that no such account exists.
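The caller ID checks described above may be sketched, under stated assumptions, as follows. The directory of entity phone numbers and the set of previously reported numbers are hypothetical in-memory stand-ins for whatever lookup service or library of interactions a deployment would use, and the phone numbers shown are fictitious. The normalization simply keeps digits and does not reconcile country codes.

```python
def normalize_number(number):
    """Keep only the digits of a phone number; return None if there are none."""
    digits = "".join(ch for ch in number if ch in "0123456789")
    return digits or None


def caller_id_risk_factors(caller_id, claimed_entity, directory, reported_numbers):
    """Return a list of non-audio risk factors found for the incoming number.

    directory: dict mapping an entity name to a set of its known numbers (digits only).
    reported_numbers: set of numbers (digits only) tied to earlier risky interactions.
    """
    number = normalize_number(caller_id) if caller_id else None
    if number is None:
        # Covers hidden numbers and labels such as "unavailable" or "out of area".
        return ["caller ID hidden or unavailable"]
    factors = []
    if number in reported_numbers:
        factors.append("number associated with prior risky interactions")
    known = directory.get(claimed_entity)
    if known is not None and number not in known:
        factors.append("number does not match claimed entity")
    return factors
```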

In step 210, the processing system may send, in response to detecting that the interaction is likely to put the sensitive data of the user at risk, an alert to notify the user that the interaction is likely to put the sensitive data of the user at risk (and/or to cause the performance of a risky action), wherein the alert is sent to prevent the user from providing the sensitive data to the other party or from performing the risky action. In other words, the alert is timed to occur before the user can reveal the sensitive data or perform the risky action. For instance, if the other party has asked for the user's social security number, the alert may be generated before the user can provide the social security number, thereby preserving the safety of the sensitive data.

In one example, the alert may comprise a visual alert (e.g., a flashing light or a text message), an audible alert (e.g., a beep or a pre-recorded verbal message), or a tactile alert (e.g., a vibration or rumble). In one example, the alert may be generated by the same device of which the processing system is a part. For instance, if the processing system is part of the user's mobile phone, the processing system may send a signal to the mobile phone's speaker to generate an audible alert. In another example, the alert may be generated by another device that is in the user's vicinity (and within communication range of the device of which the processing system is a part), such as an Internet of Things device. For instance, the processing system may send a signal to an alarm panel in the user's home that causes a light on the alarm panel to flash or a speaker to sound an alert. By generating the alert from a device that is separate from the processing system, the alert may be less easily detected by the other party.

In another example, the alert may include the processing system interrupting the interaction in order to pause the interaction or to take over control of the interaction. For instance, the processing system may mute the user's side of a telephone conversation or place the other party on hold on the call before generating an alert (e.g., an audible, visible, and/or tactile alert) to let the user know that his or her sensitive data is likely to be at risk. In another instance, the home alarm panel may broadcast a request from a speaker for the user to step away from the house door and return to the kitchen area, the bedroom area, and so on. Pausing the interaction in this way may allow the processing system to discreetly provide more information about the interaction to the user (e.g., provide a link to a web site explaining a common scam call that shares similarities with the interaction). The pause may also provide the user with some time to think about the interaction and any information he or she may be asked to provide to the other party, which may be useful if the other party is using pressure tactics in order to convince the user to provide the information (e.g., claiming an emergency situation). The pause may also deter the other party from continuing the interaction (e.g., if the other party is indeed a party who is trying to defraud the user, the pause may cause the other party to worry that the fraud will be discovered or is currently being discovered). For example, the other party may leave the user's premises immediately or depart from the house door.

In another example, the processing system may request permission from the user to conduct the remainder of the interaction on the user's behalf. If permission is received from the user, then, in response, the processing system may interact with the other party on the user's behalf. For instance, the processing system may utilize one or more speech synthesis and/or machine learning techniques in order to conduct the user's side of the interaction (e.g., similar to an interactive voice response system). When the processing system takes over the interaction in this way, the processing system may be able to prevent the user from further disclosing sensitive information to the other party and/or may be able to decline performance of a risky action. The other party may also be deterred from continuing the interaction.

In another example, the alert may include a suggestion to move the interaction to another location in which the sensitive data of the user may be less at risk. For instance, if the other party approached the user in person (e.g., at home or in a store), the processing system may suggest that the interaction be moved somewhere else, possibly where other individuals may be able to observe the interaction (e.g., outside of the user's house or near a security station). In another example, the alert may include a suggestion to consult with another individual who is known to the user (e.g., a family member or caregiver) prior to continuing the interaction.

The method 200 may end in step 212. Thus, examples of the method 200 may be able to monitor a user's audio interactions with other parties (e.g., in-person and/or phone conversations) for events that may signify the possible disclosure of sensitive user data (and/or to cause the performance of a risky action). The examples of the method 200 may be able to quantify a risk associated with the disclosure of the sensitive user data, e.g., by identifying a likelihood that the other party is requesting the sensitive user data for illegitimate or fraudulent purposes (and/or to cause the performance of a risky action). Where the disclosure of the sensitive user data (and/or the performance of a risky action) is determined to be high-risk (e.g., likely to result in fraud upon the user), the examples of the method 200 may be able to prevent or minimize the disclosure of sensitive user data (and/or the performance of a risky action) by warning the user.

For instance, a user may receive a call where the caller claims to be a representative of the user's bank, and the caller may ask the user to verify some item of sensitive data, such as the user's social security number. Upon detecting the request for the social security number, examples of the method 200 may take a number of actions in order to determine a likelihood that the caller is really from the user's bank. For instance, the processing system may interrupt the conversation and ask the caller one or more questions in order to verify the caller's identity. The processing system may ask for the caller's name or the branch location of the bank from which the caller is calling. The processing system may perform an Internet search on the caller's name, title, and/or supervisor, the bank name, and/or the branch location in order to verify whether a person with the caller's name works for the user's bank at the alleged branch location (for instance, the information may be verified on the caller's social media accounts or on the bank's web site). The processing system may also ask the caller to verify the user's current account balance and/or last transaction, which the processing system may check by logging into the bank's web site using the user's login and password.

The processing system may also ask the caller to provide a number at which to call the caller back, for instance in order to determine whether a number appearing on the user's caller ID has been spoofed. The processing system could also ask the caller to call the user's family member or caregiver at a different number in order to continue the conversation. Alternatively, the processing system may automatically transfer or reroute the ongoing call to a phone number associated with the user's family member or caregiver, or may automatically conference the phone number associated with the user's family member or caregiver into the ongoing call. In such a case, the processing system may automatically pause or mute the ongoing call until the user's family member or caregiver is on the line. In this way, the processing system may prevent the user from revealing any information and/or taking any action until his or her family member or caregiver can be consulted.

The processing system may also perform an Internet search and/or query a library of known fraudulent interactions or activities in order to see if the phone number from which the caller is calling is associated with any known fraudulent interactions or activities. For instance, other users who have been contacted from the same number from which the caller is calling may report that the calls were suspicious.

Examples of the present disclosure could also be used to protect sensitive user data in situations in which the user initiates an interaction. For instance, the user may request that the method 200 be launched in order to place a phone call to the user's life insurance company, where the purpose of the call is to ask a question about the user's life insurance policy. In this case, the processing system may interact with a human operator or with an interactive voice response system that the life insurance company uses to route phone calls. For instance, the processing system may answer one or more questions intended to confirm the user's identity (e.g., name, address, policy number, etc.) and may flag any questions that seem unusually intrusive (e.g., based on a comparison to questions asked in other similar calls with the same or other users, which may be analyzed using machine learning techniques to learn what types of questions are to be expected).

In further examples, the method 200 may be used to protect the user against audio interactions which may not necessarily put sensitive data at risk, but which may nevertheless be undesirable. For instance, the processing system may detect that the other party is swearing excessively or using threatening or aggressive language, and may advise the user to end the call.

The method 200 therefore provides an improvement to technology designed to protect user privacy by detecting situations in which the user may be asked to provide sensitive data or perform a risky action. The situations are detected in real time, i.e., as the situations are occurring. The method 200 then acts in real time to prevent the user from providing the sensitive data or from taking the risky action. Thus, the method 200 provides a more effective way to preserve the safety of sensitive data than methods which remove sensitive data from recordings or transcripts of interactions after the fact. In the latter case, the sensitive data has already been revealed, and redacting the sensitive data from the recordings or transcripts may minimize, but not completely eliminate, the risk to the sensitive data. Examples of the present disclosure go a step further by preventing the sensitive data from being revealed in the first place (unless there is a legitimate need for the sensitive data to be revealed).

It should be noted that the method 200 may be expanded to include additional steps or may be modified to include additional operations with respect to the steps outlined above. In addition, although not expressly specified, one or more steps, functions, or operations of the method 200 may include a storing, displaying, and/or outputting step as required for a particular application. In other words, any data, records, fields, and/or intermediate results discussed in the method can be stored, displayed, and/or outputted either on the device executing the method or to another device, as required for a particular application. Furthermore, steps, blocks, functions or operations in FIG. 2 that recite a determining operation or involve a decision do not necessarily require that both branches of the determining operation be practiced. In other words, one of the branches of the determining operation can be deemed as an optional step. Furthermore, steps, blocks, functions or operations of the above described method can be combined, separated, and/or performed in a different order from that described above, without departing from the examples of the present disclosure.

FIG. 3 depicts a high-level block diagram of a computing device or processing system specifically programmed to perform the functions described herein. As depicted in FIG. 3, the processing system 300 comprises one or more hardware processor elements 302 (e.g., a central processing unit (CPU), a microprocessor, or a multi-core processor), a memory 304 (e.g., random access memory (RAM) and/or read only memory (ROM)), a module 305 for protecting user data (and/or to prevent the performance of a risky action) during audio interactions, and various input/output devices 306 (e.g., storage devices, including but not limited to, a tape drive, a floppy drive, a hard disk drive or a compact disk drive, a receiver, a transmitter, a speaker, a display, a speech synthesizer, an output port, an input port and a user input device (such as a keyboard, a keypad, a mouse, a microphone and the like)). Although only one processor element is shown, it should be noted that the computing device may employ a plurality of processor elements. Furthermore, although only one computing device is shown in the figure, if the method 200 as discussed above is implemented in a distributed or parallel manner for a particular illustrative example, i.e., the steps of the above method 200 or the entire method 200 are implemented across multiple or parallel computing devices, e.g., a processing system, then the computing device of this figure is intended to represent each of those multiple computing devices.

Furthermore, one or more hardware processors can be utilized in supporting a virtualized or shared computing environment. The virtualized computing environment may support one or more virtual machines representing computers, servers, or other computing devices. In such virtualized virtual machines, hardware components such as hardware processors and computer-readable storage devices may be virtualized or logically represented. The hardware processor 302 can also be configured or programmed to cause other devices to perform one or more operations as discussed above. In other words, the hardware processor 302 may serve the function of a central controller directing other devices to perform the one or more operations as discussed above.

It should be noted that the present disclosure can be implemented in software and/or in a combination of software and hardware, e.g., using application specific integrated circuits (ASIC), a programmable gate array (PGA) including a Field PGA, or a state machine deployed on a hardware device, a computing device or any other hardware equivalents, e.g., computer readable instructions pertaining to the method discussed above can be used to configure a hardware processor to perform the steps, functions and/or operations of the above disclosed method 200. In one example, instructions and data for the present module or process 305 for protecting user data (and/or to prevent the performance of a risky action) during audio interactions (e.g., a software program comprising computer-executable instructions) can be loaded into memory 304 and executed by hardware processor element 302 to implement the steps, functions, or operations as discussed above in connection with the illustrative method 200. Furthermore, when a hardware processor executes instructions to perform “operations,” this could include the hardware processor performing the operations directly and/or facilitating, directing, or cooperating with another hardware device or component (e.g., a co-processor and the like) to perform the operations.

The processor executing the computer readable or software instructions relating to the above described method can be perceived as a programmed processor or a specialized processor. As such, the present module 305 for protecting user data (and/or to prevent the performance of a risky action) during audio interactions (including associated data structures) of the present disclosure can be stored on a tangible or physical (broadly non-transitory) computer-readable storage device or medium, e.g., volatile memory, non-volatile memory, ROM memory, RAM memory, magnetic or optical drive, device or diskette, and the like. Furthermore, a “tangible” computer-readable storage device or medium comprises a physical device, a hardware device, or a device that is discernible by the touch. More specifically, the computer-readable storage device may comprise any physical devices that provide the ability to store information such as data and/or instructions to be accessed by a processor or a computing device such as a computer or an application server.

While various examples have been described above, it should be understood that they have been presented by way of illustration only, and not a limitation. Thus, the breadth and scope of any aspect of the present disclosure should not be limited by any of the above-described examples, but should be defined only in accordance with the following claims and their equivalents.

What is claimed is:
 1. A method comprising: detecting, by a processing system including at least one processor, an audio signal that is part of an in-person interaction between a user and another party, wherein the in-person interaction is unsolicited by the user; converting, by the processing system, the audio signal into a string of text; detecting, by the processing system, that the in-person interaction is likely to put sensitive data of the user at risk, based on a comparison of the string of text to a library of known interactions that are known to put sensitive data at risk; and sending, by the processing system in response to the detecting that the in-person interaction is likely to put the sensitive data of the user at risk, an alert to notify the user that the in-person interaction is likely to put the sensitive data of the user at risk, wherein the alert is timed to be sent before the user provides the sensitive data to the another party, and wherein the alert comprises at least one of: a suggestion to conference via a telephone call another individual who is known to the user whose sensitive data is likely to be put at risk before continuing the in-person interaction or a suggestion to move the in-person interaction to another location, wherein the method is performed contemporaneously with an occurrence of the in-person interaction.
 2. The method of claim 1, where the processing system listens continuously for audio signals including the audio signal.
 3. The method of claim 1, wherein the detecting that the in-person interaction is likely to put the sensitive data of the user at risk comprises: matching, by the processing system, a word in the string of text to a keyword that is associated with an interaction in the library of known interactions.
 4. The method of claim 1, wherein the detecting that the in-person interaction is likely to put the sensitive data of the user at risk comprises: matching, by the processing system, a pattern of the interaction between the user and the another party to a pattern of an interaction in the library of known interactions.
 5. The method of claim 1, wherein the detecting is further based on a characteristic of the in-person interaction that occurs outside of the audio signal, and wherein the characteristic comprises a characteristic that is known to be associated with interactions that are known to put sensitive data at risk.
 6. The method of claim 5, wherein the characteristic is a name of the another party.
 7. The method of claim 5, wherein the characteristic is an identification of the another party.
 8. The method of claim 5, wherein the characteristic is a time of day of the in-person interaction.
 9. The method of claim 1, further comprising: placing, by the processing system, the telephone call to the another individual prior to the sending.
 10. The method of claim 1, further comprising: issuing, by the processing system, a challenge to the another party, wherein the challenge is designed to verify an identity of the another party, and wherein the challenge comprises comparing information provided by the another party to information retrieved by the processing system from the Internet.
 11. The method of claim 1, further comprising: requesting, by the processing system, a permission from the user to conduct the in-person interaction on behalf of the user; and placing, by the processing system, the telephone call directly to the another individual in response to receiving the permission from the user to conduct the in-person interaction on behalf of the user.
 12. The method of claim 1, wherein the alert comprises at least one of: a visible alert, an audible alert, or a tactile alert.
 13. The method of claim 1, wherein the processing system is part of a mobile phone of the user.
 14. The method of claim 1, further comprising: informing, by the processing system, the user during the in-person interaction to remain quiet until the telephone call is placed with the another individual.
 15. The method of claim 1, wherein the another individual is a family member or a caregiver of the user whose sensitive data is likely to be put at risk.
 16. A non-transitory computer-readable medium storing instructions which, when executed by a processing system including at least one processor, cause the processing system to perform operations, the operations comprising: detecting an audio signal that is part of an in-person interaction between a user and another party, wherein the in-person interaction is unsolicited by the user; converting the audio signal into a string of text; detecting that the in-person interaction is likely to put sensitive data of the user at risk, based on a comparison of the string of text to a library of known interactions that are known to put sensitive data at risk; and sending, in response to the detecting that the in-person interaction is likely to put the sensitive data of the user at risk, an alert to notify the user that the in-person interaction is likely to put the sensitive data of the user at risk, wherein the alert is timed to be sent before the user provides the sensitive data to the another party, and wherein the alert comprises at least one of: a suggestion to conference via a telephone call another individual who is known to the user whose sensitive data is likely to be put at risk before continuing the in-person interaction or a suggestion to move the in-person interaction to another location, wherein the method is performed contemporaneously with an occurrence of the in-person interaction.
 17. The non-transitory computer-readable medium of claim 16, wherein the detecting that the in-person interaction is likely to put the sensitive data of the user at risk comprises: matching a word in the string of text to a keyword that is associated with an interaction in the library of known interactions.
 18. The non-transitory computer-readable medium of claim 16, wherein the detecting that the in-person interaction is likely to put the sensitive data of the user at risk comprises: matching a pattern of the interaction between the user and the another party to a pattern of an interaction in the library of known interactions.
 19. The non-transitory computer-readable medium of claim 16, wherein the detecting is further based on a characteristic of the in-person interaction that occurs outside of the audio signal, and wherein the characteristic comprises a characteristic that is known to be associated with interactions that are known to put sensitive data at risk.
 20. A device comprising: a processing system including at least one processor; and a non-transitory computer-readable medium storing instructions which, when executed by the processing system, cause the processing system to perform operations, the operations comprising: detecting an audio signal that is part of an in-person interaction between a user and another party, wherein the in-person interaction is unsolicited by the user; converting the audio signal into a string of text; detecting that the in-person interaction is likely to put sensitive data of the user at risk, based on a comparison of the string of text to a library of known interactions that are known to put sensitive data at risk; and sending, in response to the detecting that the in-person interaction is likely to put the sensitive data of the user at risk, an alert to notify the user that the in-person interaction is likely to put the sensitive data of the user at risk, wherein the alert is timed to be sent before the user provides the sensitive data to the another party, and wherein the alert comprises at least one of: a suggestion to conference via a telephone call another individual who is known to the user whose sensitive data is likely to be put at risk before continuing the in-person interaction or a suggestion to move the in-person interaction to another location, wherein the method is performed contemporaneously with an occurrence of the in-person interaction.